AITopics | cost model

Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization

Neural Information Processing SystemsJun-17-2026, 13:27:04 GMT

Balancing helpfulness and safety (harmlessness) is a critical challenge in aligning large language models (LLMs). Current approaches often decouple these two objectives, training separate preference models for helpfulness and safety, while framing safety as a constraint within a constrained Markov Decision Process (CMDP) framework. This paper identifies a potential issue when using the widely adopted expected safety constraints for LLM safety alignment, termed "safety compensation", where the constraints are satisfied on expectation, but individual prompts may trade off safety, resulting in some responses being overly restrictive while others remain unsafe. To address this issue, we propose Rectified Policy Optimization (RePO), which replaces the expected safety constraint with critical safety constraints imposed on every prompt. At the core of RePO is a policy update mechanism driven by rectified policy gradients, which penalizes the strict safety violation of every prompt, thereby enhancing safety across nearly all prompts. Our experiments demonstrate that RePO outperforms strong baseline methods and significantly enhances LLM safety alignment.

large language model, machine learning, reinforcement learning, (21 more...)

Neural Information Processing Systems

Country: North America > United States > Arizona (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Leisure & Entertainment > Sports > Baseball (1.00)
Health & Medicine > Therapeutic Area (1.00)
Media (0.67)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness

Neural Information Processing SystemsApr-25-2026, 06:15:25 GMT

Scaling up model sizes can lead to fundamentally new capabilities in many machine learning (ML) tasks. However, training big models requires strong distributed system expertise to carefully design model-parallel execution strategies that suit the model architectures and cluster setups. In this paper, we develop AMP, a framework that automatically derives such strategies. AMP identifies a valid space of model parallelism strategies and efficiently searches the space for high-performed strategies, by leveraging a cost model designed to capture the heterogeneity of the model and cluster specifications. Unlike existing methods, AMP is specifically tailored to support complex models composed of uneven layers and cluster setups with more heterogeneous accelerators and bandwidth. We evaluate AMP on popular models and cluster setups from public clouds and show that AMP returns parallel strategies that match the expert-tuned strategies on typical cluster setups. On heterogeneous clusters or models with heterogeneous architectures, AMP finds strategies with 1.54 and 1.77 higher throughput than state-of-the-art model-parallel systems, respectively.

artificial intelligence, machine learning, optimization problem, (20 more...)

Neural Information Processing Systems

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

becd02b89259774da2ede23116a80648-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 21:23:19 GMT

artificial intelligence, machine learning, prediction, (19 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.67)
Banking & Finance (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Data Science (0.67)
Information Technology > Software (0.67)

Add feedback

d1a14493e5f84d6c6129414f0cd1a7c6-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 06:29:50 GMT

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning to Optimize Tensor Programs

Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Neural Information Processing SystemsFeb-13-2026, 15:15:05 GMT

Neural Information Processing Systems http://nips.cc/

latexit sha1, npruyln66 puozamtmm3tsfgc5w, optimization, (15 more...)

Neural Information Processing Systems

Country:

Africa > Mali (0.05)
North America > United States > New York > New York County > New York City (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Towards Improved Safety Alignment of LLM via a Human-Preference Dataset Jiaming Ji

Neural Information Processing SystemsFeb-11-2026, 14:52:47 GMT

Warning: this paper contains example data that may be offensive or harmful.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe > Germany (0.04)
Asia > China > Beijing > Beijing (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Health Care Providers & Services (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

b14680dec683e744ada1f2fe08614086-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-9-2026, 21:02:37 GMT

algorithm, cost model, graph, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.37)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.32)

Add feedback

2b4bfa1cebe78d125fefd7ea6ffcfc6d-Paper-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 02:05:31 GMT

bandwidth, cost model, heterogeneity, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Learning to Optimize Tensor Programs

Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Neural Information Processing SystemsNov-20-2025, 18:07:08 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, latexit sha1, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Africa > Mali (0.05)
North America > United States > New York > New York County > New York City (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Common Concerns 2 Novelty

Neural Information Processing SystemsNov-19-2025, 00:22:20 GMT

We thank all reviewers for their valuable comments. We address the concerns raised by them below. The idea of using imitation learning to make approximate decisions is not new. The author needs to provide a wall-clock time cost comparison of different methods. We will include them in the final verision.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.33)

Add feedback

Filters

Collaborating Authors

cost model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization

AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness

becd02b89259774da2ede23116a80648-Paper-Conference.pdf

d1a14493e5f84d6c6129414f0cd1a7c6-Paper-Conference.pdf

Learning to Optimize Tensor Programs

Towards Improved Safety Alignment of LLM via a Human-Preference Dataset Jiaming Ji

b14680dec683e744ada1f2fe08614086-AuthorFeedback.pdf

2b4bfa1cebe78d125fefd7ea6ffcfc6d-Paper-Conference.pdf

Learning to Optimize Tensor Programs

Common Concerns 2 Novelty